Skip to content

fix: non-breaking detection for redundant cast projections#5851

Open
albertosuman-1k5 wants to merge 3 commits into
SQLMesh:mainfrom
1K5-TECH:new-main
Open

fix: non-breaking detection for redundant cast projections#5851
albertosuman-1k5 wants to merge 3 commits into
SQLMesh:mainfrom
1K5-TECH:new-main

Conversation

@albertosuman-1k5

Copy link
Copy Markdown
Contributor

Description

Summary

  • Fixes a false-positive Breaking categorization when adding a mid-list projection with a redundant same-type cast, e.g. new_col::STRING above an existing ::STRING projection.
  • Adds a conservative projection-additivity fallback for SqlModel.is_breaking_change() when SQLGlot’s AST diff emits spurious Move / Update edits.
  • Preserves conservative behavior for real changes such as projection rewrites, removals, reorders, filter/group changes, and cardinality-changing UDTFs.

Context

SQLGlot’s tree diff can cross-match structurally similar cast nodes, especially identical DataType leaves such as STRING. This can make a purely additive projection change look like a non-Insert diff, causing SQLMesh to return None from is_breaking_change(). Under AutoCategorizationMode.FULL, that becomes BREAKING.

The new fallback verifies directly that:

  • both rendered queries are top-level SELECTs,
  • the query skeleton is unchanged except for the projection list,
  • previous projections are preserved in order,
  • and added projections do not include cardinality-changing UDTFs.

Test plan

  • python -m pytest tests/core/test_snapshot.py -k "categorize_change_sql or categorize_change_seed" -q
  • python -m pytest cast_breaking_change_demo/test_cast_breaking_change.py -q
  • python -m pytest tests/core/test_snapshot.py tests/core/test_model.py::test_is_breaking_change tests/core/test_plan.py -q
  • pre-commit run --files sqlmesh/core/model/definition.py tests/core/test_snapshot.py cast_breaking_change_demo/test_cast_breaking_change.py

Checklist

  • I have run make style and fixed any issues
  • I have added tests for my changes (if applicable)
  • All existing tests pass (make fast-test)
  • My commits are signed off (git commit -s) per the DCO

@albertosuman-1k5

Copy link
Copy Markdown
Contributor Author

Can we trigger again the workflows? Pretty sure the first run failed because of the issue described in #5852

@mday-io

mday-io commented Jun 23, 2026

Copy link
Copy Markdown
Collaborator

Can we trigger again the workflows? Pretty sure the first run failed because of the issue described in #5852

rerunning...

@mday-io mday-io self-requested a review June 23, 2026 13:50
# Everything other than the projection list must be structurally identical. Replacing each
# SELECT list with the same dummy literal lets the expression equality check focus on the
# FROM / WHERE / GROUP BY / ORDER BY / etc. parts of the query.
previous_skeleton = previous_query.copy()

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fallback can incorrectly treat projection inserts as non-breaking when the query uses positional references outside the SELECT list.

For example, this is currently categorized as NON_BREAKING on this branch:

SELECT a::DATE, s::TEXT FROM t ORDER BY 2
->
SELECT a::DATE, x::TEXT, s::TEXT FROM t ORDER BY 2

But ORDER BY 2 changes from sorting by s to sorting by x. The same issue applies to GROUP BY 2. Replacing both select lists with a dummy literal makes the skeleton comparison pass even though ordinal references still depend on the original projection positions.


# Adding a UDTF projection (e.g. EXPLODE / UNNEST) can change row cardinality, so such a
# change is not safely non-breaking even when it appears as an extra SELECT item.
for projection in this_projections:

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This UDTF guard misses aliased UDTF projections because EXPLODE(y) AS y is an exp.Alias, not an exp.UDTF.

When the fallback is triggered by same-type casts, this branch can categorize a change like this as NON_BREAKING:

SELECT a::DATE AS a, s::TEXT AS s FROM t
->
SELECT a::DATE AS a, x::TEXT AS x, EXPLODE(y) AS y, s::TEXT AS s FROM t

Upstream returns None for this case. The check probably needs to inspect newly added projections for nested UDTFs, while still allowing UDTFs inside scalar subqueries as the existing tests expect.

@mday-io

mday-io commented Jun 23, 2026

Copy link
Copy Markdown
Collaborator

Could you add tests for queries that reference SELECT-list positions from other clauses, such as ordinal ORDER BY / GROUP BY, to ensure those stay undetermined instead of being auto-categorized as non-breaking?

It would also be good to add coverage for aliased UDTF projections, since those may not appear as a top-level UDTF expression in the projection list but should still remain conservative.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants